Scalable Transaction Management with Serializable Snapshot Isolation on HBase

نویسندگان

  • Vinit Padhye
  • Anand Tripathi
چکیده

Key-value based data storage systems such as HBase and Bigtable provide high scalability and availability compared to traditional relational databases. However, unlike relational databases, the existing key-value stores provide only limited transactional functionality, such as single-row transactions. In this paper, we address the problem of building scalable transaction management mechanisms for multirow ACID transactions on data storage systems such as HBase. To support scalability and availability, the transaction management functions are decoupled from the data storage system. Furthermore, our design does not depend on any central transaction management layer, instead these functions and the related recovery actions are performed in a decentralized and cooperative manner by the application level processes executing the transactions. Our protocol uses snapshot-isolation model as it provides more concurrency. Since the basic snapshot isolation model does not guarantee serializability, our protocol uses a technique based on identifying dependency cycles amongst transactions to avoid serialization anomalies. The protocol for supporting serializability is also performed in a decentralized manner. We demonstrate the scalability and robustness of our approach as well as the correctness of the protocol in ensuring serializable executions of transactions on HBase. We also present and evaluate an alternative approach based on a hybrid model where certain functions, such as conflict detection, are performed by a dedicated service.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

HBaseSI: Multi-row Distributed Transactions with Global Strong Snapshot Isolation on Clouds

This paper presents the “HBaseSI” client library, which provides global strong snapshot isolation (SI) for multi-row distributed transactions in HBase. This is the first strong SI mechanism developed for HBase. HBaseSI uses novel methods in handling distributed transactional management autonomously by individual clients. These methods greatly simplify the design of HBaseSI and can be generalize...

متن کامل

Towards serializable replication with snapshot isolation

Replicated database systems necessarily deal with multiple versions of data items active concurrently across nodes in a replication group. As a consequence, there is a natural fit between replication and snapshot isolation (SI), which uses multiple versions of data within a single site to provide nonblocking read operations. However, snapshot isolation does not guarantee serializable execution ...

متن کامل

Serializable Snapshot Isolation in Shared-Nothing, Distributed Database Management Systems

NoSQL data storage systems provide high scalability and availability in exchange for limited transactional guarantees. In many cases, however, an application cannot give up transactional support but still needs the scalability provided by such systems. One approach for overcoming this limitation is to implement Snapshot Isolation (SI) on top of these systems. SI prevents most non-serializable e...

متن کامل

Serializable Snapshot Isolation for Replicated Databases in High-Update Scenarios

Many proposals for managing replicated data use sites running the Snapshot Isolation (SI) concurrency control mechanism, and provide 1-copy SI or something similar, as the global isolation level. This allows good scalability, since only ww-conflicts need to be managed globally. However, 1-copy SI can lead to data corruption and violation of integrity constraints [5]. 1-copy serializability is t...

متن کامل

Modelling Snapshot Isolation Performance

Snapshot Isolation (SI) level is extensively used in commercial database systems. We developed a simple SI implementation protocol for distributed DBMS and implemented it in the Apache HBase. The work presents the performance evaluation of the protocol. We have measured the performance of a single-node system and modeled the performance of a distributed HBase cluster.

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012